Exploring the core of modern AI: a comprehensive guide to implementing the Transformer's attention mechanism, from theory to code. Scaled Dot-Product and Multi-Head Attention, explained for developers and enthusiasts around the world.
Decoding the Transformer: A Deep Dive into Implementing the Attention Mechanism
In 2017, the world of artificial intelligence was fundamentally changed by a single research paper from Google Brain titled "Attention Is All You Need." The paper introduced the Transformer, a new architecture that completely removed the recurrent and convolutional layers that had until then dominated sequence-based tasks such as machine translation. At the heart of this revolution was a powerful and elegant concept: the attention mechanism.
Today, the Transformer is the foundation of nearly every state-of-the-art AI model, from large language models such as GPT-4 and LLaMA to breakthrough models in computer vision and drug discovery. For AI practitioners, understanding the attention mechanism is no longer optional; it is essential. This comprehensive guide is written for developers, data scientists, and AI enthusiasts around the world. We will unpack the attention mechanism from its underlying principles all the way to a practical code implementation. Our goal is to give you both the intuition and the technical skills needed to understand and build the engine that powers modern AI.
What Is Attention? A Universal Intuition
Before diving into matrices and equations, let's build a universal intuition. Imagine you are reading this sentence: "The ship, loaded with cargo from several international ports, sailed smoothly across the sea."
To understand the meaning of the word "sailed," your brain does not give equal weight to every other word in the sentence. Instinctively, it pays more attention to "cargo," "ports," "ship," and "sea." This selective focus, the ability to dynamically weigh the importance of different pieces of information while processing a particular element, is the essence of attention.
In the context of AI, the attention mechanism gives a model the same ability. When processing one part of an input sequence (a word in a sentence, a patch in an image, and so on), the model can look at the entire sequence and judge which other parts are most relevant for understanding the current one. This ability to model long-range dependencies directly, without passing information step by step through a recurrent chain, is what makes the Transformer so powerful and efficient.
The Core Engine: Scaled Dot-Product Attention
The most common form of attention used in the Transformer is called Scaled Dot-Product Attention. Its formula can look intimidating at first glance, but it is built from a series of logical steps that map neatly onto the intuition we just developed.
The formula is: Attention(Q, K, V) = softmax(QK^T / √d_k) * V
Let's unpack it by looking at its three main inputs, one at a time.
The Holy Trinity: Query, Key, and Value (Q, K, V)
To implement attention, we transform the input data (for example, word embeddings) into three different representations: queries, keys, and values. Think of it as a retrieval system, like searching for information in a digital library.
- Query (Q): This represents the item you are currently focusing on. It is your "question." For a given word, its query vector asks the rest of the sentence, "What information is relevant to me?"
- Key (K): Every item in the sequence has a key vector, which acts like a label, title, or keyword for a piece of information. The query is compared against all keys to find the most relevant ones.
- Value (V): Every item in the sequence also has a value vector, which holds the actual content or information. Once the query has found the best-matching keys, we retrieve their corresponding values.
In self-attention, the mechanism used inside the Transformer's encoder and decoder, the queries, keys, and values are all generated from the same input sequence. Each word in the sentence produces its own Q, K, and V vectors by passing through three separate learned linear layers. This lets the model compute attention between every word in a sentence and every other word in that same sentence.
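To make this concrete, here is a minimal sketch of those three projections in PyTorch. It is illustrative only: the tensor `x`, the sizes `d_model` and `seq_len`, and the layer names `W_q`, `W_k`, `W_v` are assumptions for this example, not code from the reference implementation shown later.

import torch
import torch.nn as nn

d_model = 512                     # embedding size (assumed for illustration)
seq_len = 10                      # number of tokens in the example sentence

# One embedding vector per token; in a real model this comes from an embedding layer.
x = torch.randn(seq_len, d_model)

# Three separate learned linear layers produce Q, K, and V from the SAME input.
W_q = nn.Linear(d_model, d_model)
W_k = nn.Linear(d_model, d_model)
W_v = nn.Linear(d_model, d_model)

Q, K, V = W_q(x), W_k(x), W_v(x)
print(Q.shape, K.shape, V.shape)  # each: torch.Size([10, 512])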
A Step-by-Step Implementation Walkthrough
Let's walk through each operation in the formula and connect it to its purpose.
Step 1: Computing the Similarity Scores (Q * K^T)
The first step is to measure how well each query aligns with each key. We do this by taking the dot product of every query vector with every key vector. In practice, this is done efficiently for the whole sequence with a single matrix multiplication: `Q` times the transpose of `K` (`K^T`); a quick shape check follows the list below.
- Input: a query matrix `Q` of shape `(sequence_length, d_q)` and a key matrix `K` of shape `(sequence_length, d_k)`. Note: `d_q` must equal `d_k`.
- Operation: `Q * K^T`
- Output: an attention score matrix of shape `(sequence_length, sequence_length)`. Element `(i, j)` of this matrix is the raw similarity score between word `i` (as the query) and word `j` (as the key). The higher the score, the stronger the relationship.
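As that quick check, the following sketch (random tensors with assumed sizes, purely for illustration) confirms the shapes involved:

import torch

seq_len, d_k = 10, 64                          # illustrative sizes
Q = torch.randn(seq_len, d_k)
K = torch.randn(seq_len, d_k)

scores = torch.matmul(Q, K.transpose(-2, -1))  # (seq_len, d_k) x (d_k, seq_len)
print(scores.shape)                            # torch.Size([10, 10]): one score per (query, key) pair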
Step 2: Scaling ( / √d_k )
This is a crucial but simple stabilization step. The authors of the original paper found that as the key dimension `d_k` grows, the magnitude of the dot products can become very large. Feeding such large values into the softmax function (the next step) can push it into regions where the gradients become extremely small. This phenomenon, known as vanishing gradients, can make the model hard to train.
To counteract this, the scores are scaled down by dividing them by the square root of the key dimension, √d_k. This keeps the variance of the scores close to 1 (assuming the query and key components themselves have roughly unit variance), which gives more stable gradients throughout training.
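The short experiment below (an illustrative sketch, not part of the article's reference code; the sizes and random seed are arbitrary) shows the effect: with a large `d_k`, the unscaled scores have a huge spread and the softmax collapses to a near one-hot distribution, while the scaled scores keep it soft and trainable.

import torch
import math

torch.manual_seed(0)
d_k, seq_len = 512, 8                     # deliberately large d_k for illustration
q = torch.randn(seq_len, d_k)
k = torch.randn(seq_len, d_k)

raw = q @ k.T                             # unscaled scores: spread grows like sqrt(d_k)
scaled = raw / math.sqrt(d_k)             # scaled scores: spread stays around 1

print(raw.std().item(), scaled.std().item())
print(torch.softmax(raw[0], dim=-1))      # nearly one-hot -> tiny gradients elsewhere
print(torch.softmax(scaled[0], dim=-1))   # smoother distribution -> healthier gradients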
Step 3: Applying the Softmax (softmax(...))
We now have a matrix of scaled alignment scores. These scores can be arbitrary real numbers, so to make them interpretable and useful, we apply the softmax function along each row. Softmax does two things:
- It converts all scores into positive numbers.
- It normalizes each row so that its scores sum to 1.
The output of this step is the attention weight matrix. Each row is a probability distribution that says how much attention the word at that row's position should pay to every other word in the sequence. If the weight for the word "ship" in the row for "sailed" is 0.9, it means that 90% of the information used to compute the new representation of "sailed" comes from "ship."
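As a tiny numeric illustration (the score values are made up for this example, not taken from the article), softmax turns a row of scaled scores into exactly this kind of distribution:

import torch

# Hypothetical scaled scores for "sailed" against ["ship", "cargo", "the", "sea"]
row = torch.tensor([3.0, 1.0, -1.0, 0.5])
weights = torch.softmax(row, dim=-1)
print(weights)        # roughly tensor([0.81, 0.11, 0.01, 0.07]): all positive
print(weights.sum())  # tensor(1.): each row always sums to 1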
Step 4: Computing the Weighted Sum ( * V )
The final step is to use these attention weights to create a new, context-aware representation of each word. We do this by multiplying the attention weight matrix by the value matrix `V`.
- Input: the attention weight matrix `(sequence_length, sequence_length)` and the value matrix `V` `(sequence_length, d_v)`.
- Operation: `weights * V`
- Output: the final output matrix of shape `(sequence_length, d_v)`.
For each word (each row), this new representation is a weighted sum of all the value vectors in the sequence. Words that received larger attention weights contribute more to the sum. The result is a set of embeddings in which each word's vector blends not only its own meaning but also the meaning of the words it attended to; it is now enriched with context.
A Practical Code Example: Implementing Scaled Dot-Product Attention in PyTorch
Theory becomes clearer through practice. Below is a simple, commented implementation of the Scaled Dot-Product Attention mechanism in Python using PyTorch, a popular deep learning framework.
import torch
import torch.nn as nn
import math


class ScaledDotProductAttention(nn.Module):
    """ Implements the Scaled Dot-Product Attention mechanism. """

    def __init__(self):
        super(ScaledDotProductAttention, self).__init__()

    def forward(self, q, k, v, mask=None):
        # q and k must share the same dimension d_k; typically d_k = d_v = d_model / h.
        # In practice, these tensors will also have a batch dimension and head dimension.
        # For clarity, let's assume shape [batch_size, num_heads, seq_len, d_k].
        d_k = k.size(-1)  # Get the dimension of the key vectors

        # 1. Calculate Similarity Scores: (Q * K^T)
        # Matmul over the last two dimensions: (seq_len, d_k) * (d_k, seq_len) -> (seq_len, seq_len)
        scores = torch.matmul(q, k.transpose(-2, -1))

        # 2. Scale the scores
        scaled_scores = scores / math.sqrt(d_k)

        # 3. (Optional) Apply mask to prevent attention to certain positions
        # The mask is crucial in the decoder to prevent attending to future tokens.
        if mask is not None:
            # Fill positions where mask == 0 with -1e9 so softmax assigns them ~zero weight.
            scaled_scores = scaled_scores.masked_fill(mask == 0, -1e9)

        # 4. Apply Softmax to get attention weights
        # Softmax is applied on the last dimension (the keys) to get a distribution.
        attention_weights = torch.softmax(scaled_scores, dim=-1)

        # 5. Compute the Weighted Sum: (weights * V)
        # Matmul over the last two dimensions: (seq_len, seq_len) * (seq_len, d_v) -> (seq_len, d_v)
        output = torch.matmul(attention_weights, v)

        return output, attention_weights
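Continuing from the class above, here is a short usage sketch (the tensor sizes and the causal mask construction are illustrative assumptions, not part of the reference code). It runs the module on random inputs with a lower-triangular mask of the kind a decoder uses to block attention to future tokens:

batch_size, num_heads, seq_len, d_k = 2, 8, 5, 64
q = torch.randn(batch_size, num_heads, seq_len, d_k)
k = torch.randn(batch_size, num_heads, seq_len, d_k)
v = torch.randn(batch_size, num_heads, seq_len, d_k)

# Causal (look-ahead) mask: position i may only attend to positions <= i.
# Shape (seq_len, seq_len); it broadcasts over the batch and head dimensions.
causal_mask = torch.tril(torch.ones(seq_len, seq_len)).bool()

attention = ScaledDotProductAttention()
output, weights = attention(q, k, v, mask=causal_mask)
print(output.shape)   # torch.Size([2, 8, 5, 64])
print(weights[0, 0])  # upper triangle is ~0 thanks to the mask; each row still sums to 1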
Leveling Up: Multi-Head Attention
The Scaled Dot-Product Attention mechanism is powerful, but it has a limitation: because it computes a single set of attention weights, its focus can end up averaged out. A single attention mechanism might learn to focus on, say, the relationship between subject and verb, but what about other relationships, such as the link between a pronoun and its antecedent, or stylistic nuance?
This is where Multi-Head Attention comes in. Instead of performing a single attention computation, it runs the attention mechanism several times in parallel and combines the results.
The Why: Capturing Diverse Relationships
Think of it as having a committee of specialists rather than a single generalist. Each head in Multi-Head Attention can be seen as a specialist that learns to focus on a different kind of relationship or aspect of the input data.
Consider the sentence "The animal didn't cross the street because it was too tired," where the pronoun "it" refers to "animal."
- Head 1 might learn to link the pronoun "it" to its antecedent, "animal."
- Head 2 might learn the causal relationship between "didn't cross" and "tired."
- Head 3 might capture the syntactic relationship between the verb "was" and its subject "it."
By having multiple heads (the original Transformer paper uses 8), the model can simultaneously capture a rich variety of syntactic and semantic relationships in the data, leading to far more nuanced and powerful representations.
The How: Split, Attend, Concatenate, Project
The implementation of Multi-Head Attention follows a four-step process:
- Linear projections: the input embeddings pass through three separate linear layers to create the initial query, key, and value matrices. These are then split into `h` smaller parts, one per head. For example, if the model dimension `d_model` is 512 and there are 8 heads, each head works with Q, K, and V vectors of dimension 64 (512 / 8).
- Parallel attention: the Scaled Dot-Product Attention mechanism described earlier is applied independently and in parallel to each of the `h` Q, K, V subspaces. This yields `h` separate attention output matrices.
- Concatenation: the `h` output matrices are concatenated back into a single large matrix. In our example, 8 matrices of size 64 are concatenated to form one matrix of size 512.
- Final projection: the concatenated matrix passes through one last linear layer. This layer lets the model learn how best to combine the information captured by the different heads, producing the unified final output.
Code Implementation: Multi-Head Attention in PyTorch
Building on the previous code, below is a standard implementation of a Multi-Head Attention block.
class MultiHeadAttention(nn.Module):
    """ Implements the Multi-Head Attention mechanism. """

    def __init__(self, d_model, num_heads):
        super(MultiHeadAttention, self).__init__()
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"

        self.d_model = d_model
        self.num_heads = num_heads
        self.d_k = d_model // num_heads

        # Linear layers for Q, K, V and the final output
        self.W_q = nn.Linear(d_model, d_model)
        self.W_k = nn.Linear(d_model, d_model)
        self.W_v = nn.Linear(d_model, d_model)
        self.W_o = nn.Linear(d_model, d_model)

        self.attention = ScaledDotProductAttention()

    def forward(self, q, k, v, mask=None):
        batch_size = q.size(0)

        # 1. Apply linear projections
        q, k, v = self.W_q(q), self.W_k(k), self.W_v(v)

        # 2. Reshape for multi-head attention
        # (batch_size, seq_len, d_model) -> (batch_size, num_heads, seq_len, d_k)
        q = q.view(batch_size, -1, self.num_heads, self.d_k).transpose(1, 2)
        k = k.view(batch_size, -1, self.num_heads, self.d_k).transpose(1, 2)
        v = v.view(batch_size, -1, self.num_heads, self.d_k).transpose(1, 2)

        # 3. Apply attention on all heads in parallel
        context, _ = self.attention(q, k, v, mask=mask)

        # 4. Concatenate heads and apply final linear layer
        # (batch_size, num_heads, seq_len, d_k) -> (batch_size, seq_len, num_heads, d_k)
        context = context.transpose(1, 2).contiguous()
        # (batch_size, seq_len, num_heads, d_k) -> (batch_size, seq_len, d_model)
        context = context.view(batch_size, -1, self.d_model)

        output = self.W_o(context)
        return output
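To see the whole block in action, here is a brief usage sketch. It continues from the two classes defined above; the batch size, sequence length, and random inputs are illustrative assumptions.

d_model, num_heads = 512, 8
batch_size, seq_len = 2, 10

mha = MultiHeadAttention(d_model, num_heads)
x = torch.randn(batch_size, seq_len, d_model)  # token embeddings (a real model would add positional encodings)

# Self-attention: the same tensor supplies the queries, keys, and values.
out = mha(x, x, x)
print(out.shape)  # torch.Size([2, 10, 512]): same shape as the input, but now context-aware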
Global Impact: Why This Mechanism Is a Game Changer
The principle of attention is not limited to natural language processing. The mechanism has proven to be a versatile and powerful tool across many fields, driving progress on a global scale.
- Breaking down language barriers: in machine translation, attention lets a model build direct, non-linear alignments between words in different languages. It can, for example, correctly map the French "la voiture bleue" to the English "the blue car," gracefully handling the difference in adjective placement.
- Powering search and summarization: in tasks such as summarizing long documents or answering questions about them, self-attention allows a model to understand the intricate web of relationships between sentences and concepts and to identify the most important ones.
- Advancing science and medicine: beyond text, attention is used to model complex interactions in scientific data. In genomics it can model dependencies between distant base pairs in a DNA sequence; in drug discovery it helps predict protein interactions, accelerating research into new therapies.
- Revolutionizing computer vision: with the advent of the Vision Transformer (ViT), the attention mechanism is now a cornerstone of modern computer vision. By treating an image as a sequence of patches, self-attention lets a model understand the relationships between different regions of an image, achieving state-of-the-art performance in image classification and object detection.
Conclusion: The Future Is Attentive
The journey from the intuitive notion of focus to a practical implementation of Multi-Head Attention reveals a mechanism that is as logical as it is powerful. It allows AI models to process information not as a rigid sequence but as a flexible, interconnected network of relationships. This shift in perspective, introduced by the Transformer architecture, unlocked unprecedented capabilities in AI.
By understanding how the attention mechanism is implemented and how to interpret it, you now grasp one of the fundamental building blocks of modern AI. As research continues to evolve, new and more efficient attention variants will undoubtedly appear, but the core principle of selectively focusing on what matters most will remain a central theme in the ongoing quest for more intelligent and capable systems.